Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 18249 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.6 MiB |
| Average record size in memory | 93.3 B |
Variable types
| NUM | 10 |
|---|---|
| CAT | 5 |
| DATE | 1 |
Warnings
| Warning | Type |
|---|---|
| region has a high cardinality: 54 distinct values | High cardinality |
| 4046 is highly correlated with total volume and 3 other fields | High correlation |
| total volume is highly correlated with 4046 and 3 other fields | High correlation |
| 4225 is highly correlated with total volume and 3 other fields | High correlation |
| total bags is highly correlated with total volume and 4 other fields | High correlation |
| small bags is highly correlated with total volume and 4 other fields | High correlation |
| large bags is highly correlated with total bags and 1 other field | High correlation |
| region is uniformly distributed | Uniform |
| df_index has 432 (2.4%) zeros | Zeros |
| 4046 has 242 (1.3%) zeros | Zeros |
| 4770 has 5498 (30.1%) zeros | Zeros |
| large bags has 2371 (13.0%) zeros | Zeros |
| xlarge bags has 12048 (66.0%) zeros | Zeros |
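The percentages in the Zeros warnings above can be reproduced from the raw counts and the number of observations. The following is a minimal sketch, not the report's own code; the names `zero_counts`, `zero_share` and `zero_shares` are illustrative assumptions.

```python
# Zero counts copied from the Zeros warnings above; the row count comes
# from the "Number of observations" entry in the dataset statistics.
N_OBSERVATIONS = 18249

zero_counts = {
    "df_index": 432,
    "4046": 242,
    "4770": 5498,
    "large bags": 2371,
    "xlarge bags": 12048,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Return the share of zeros as a percentage string with one decimal."""
    return f"{100 * count / total:.1f}%"


# Hypothetical helper mapping: variable name -> formatted share of zeros.
zero_shares = {name: zero_share(count) for name, count in zero_counts.items()}

for name, share in zero_shares.items():
    print(f"{name}: {share}")
```

A share this high for xlarge bags and 4770 suggests treating zeros as their own case before any log transform or ratio computation.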
Reproduction
| Analysis started | 2020-09-14 16:09:13.672094 |
|---|---|
| Analysis finished | 2020-09-14 16:09:36.326305 |
| Duration | 22.65 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
| Distinct | 53 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.2322319 |
|---|---|
| Minimum | 0 |
| Maximum | 52 |
| Zeros | 432 |
| Zeros (%) | 2.4% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 10 |
| median | 24 |
| Q3 | 38 |
| 95-th percentile | 49 |
| Maximum | 52 |
| Range | 52 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 15.48104475 |
|---|---|
| Coefficient of variation (CV) | 0.6388616953 |
| Kurtosis | -1.254364272 |
| Mean | 24.2322319 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.1083337271 |
| Sum | 442214 |
| Variance | 239.6627467 |
| Monotonicity | Not monotonic |
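Several of the statistics above are derived from the others, so they can be cross-checked by hand. The sketch below recomputes the coefficient of variation, variance, IQR, range and sum from the reported values; the variable names are illustrative, and the observation count is taken from the dataset statistics.

```python
import math

# Values copied from the quantile and descriptive statistics above.
n_observations = 18249
mean = 24.2322319
std = 15.48104475
q1, q3 = 10, 38
minimum, maximum = 0, 52

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum      # range
total_sum = mean * n_observations    # sum recovered from the mean

print(f"CV={cv:.10f} variance={variance:.7f} IQR={iqr} "
      f"range={value_range} sum={total_sum:.0f}")
```

Small differences in the last digits are expected, because the inputs are themselves rounded in the report.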
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 7 | 432 | 2.4% | |
| 11 | 432 | 2.4% | |
| 1 | 432 | 2.4% | |
| 2 | 432 | 2.4% | |
| 3 | 432 | 2.4% | |
| 4 | 432 | 2.4% | |
| 5 | 432 | 2.4% | |
| 6 | 432 | 2.4% | |
| 8 | 432 | 2.4% | |
| 9 | 432 | 2.4% | |
| Other values (43) | 13929 | 76.3% |
| Value | Count | Frequency (%) | |
| 0 | 432 | 2.4% | |
| 1 | 432 | 2.4% | |
| 2 | 432 | 2.4% | |
| 3 | 432 | 2.4% | |
| 4 | 432 | 2.4% |
| Value | Count | Frequency (%) | |
| 52 | 107 | 0.6% | |
| 51 | 322 | 1.8% | |
| 50 | 324 | 1.8% | |
| 49 | 324 | 1.8% | |
| 48 | 324 | 1.8% |
date
Date
| Distinct | 169 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 142.6 KiB |
| Minimum | 2015-01-04 00:00:00 |
|---|---|
| Maximum | 2018-03-25 00:00:00 |
Histogram with fixed size bins (bins=50)
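The distinct count for date is what one would expect from weekly observations between the minimum and maximum dates. The sketch below checks this under the assumption, not stated in the report, that there is exactly one observation date every 7 days with no gaps.

```python
from datetime import date, timedelta

first = date(2015, 1, 4)   # Minimum from the table above
last = date(2018, 3, 25)   # Maximum from the table above

# Assumption: one observation date every 7 days, with no gaps.
weekly_dates = []
current = first
while current <= last:
    weekly_dates.append(current)
    current += timedelta(days=7)

print(len(weekly_dates))
```

Under that assumption the count comes out to 169, matching the Distinct value reported for date.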
averageprice
Real number (ℝ≥0)
| Distinct | 259 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.40597841 |
|---|---|
| Minimum | 0.44 |
| Maximum | 3.25 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0.44 |
|---|---|
| 5-th percentile | 0.83 |
| Q1 | 1.1 |
| median | 1.37 |
| Q3 | 1.66 |
| 95-th percentile | 2.11 |
| Maximum | 3.25 |
| Range | 2.81 |
| Interquartile range (IQR) | 0.56 |
Descriptive statistics
| Standard deviation | 0.4026765555 |
|---|---|
| Coefficient of variation (CV) | 0.2864030861 |
| Kurtosis | 0.3251958507 |
| Mean | 1.40597841 |
| Median Absolute Deviation (MAD) | 0.28 |
| Skewness | 0.5803027379 |
| Sum | 25657.7 |
| Variance | 0.1621484083 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 1.15 | 202 | 1.1% | |
| 1.18 | 199 | 1.1% | |
| 1.08 | 194 | 1.1% | |
| 1.26 | 193 | 1.1% | |
| 1.13 | 192 | 1.1% | |
| 0.98 | 189 | 1.0% | |
| 1.19 | 188 | 1.0% | |
| 1.36 | 187 | 1.0% | |
| 1.59 | 186 | 1.0% | |
| 0.99 | 185 | 1.0% | |
| Other values (249) | 16334 | 89.5% |
| Value | Count | Frequency (%) | |
| 0.44 | 1 | < 0.1% | |
| 0.46 | 1 | < 0.1% | |
| 0.48 | 1 | < 0.1% | |
| 0.49 | 2 | < 0.1% | |
| 0.51 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 3.25 | 1 | < 0.1% | |
| 3.17 | 1 | < 0.1% | |
| 3.12 | 1 | < 0.1% | |
| 3.05 | 1 | < 0.1% | |
| 3.04 | 1 | < 0.1% |
total volume
Real number (ℝ≥0)
| Distinct | 17137 |
|---|---|
| Distinct (%) | 93.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 850643.523 |
|---|---|
| Minimum | 84 |
| Maximum | 62505646 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 84 |
|---|---|
| 5-th percentile | 2371.8 |
| Q1 | 10838 |
| median | 107376 |
| Q3 | 432962 |
| 95-th percentile | 3716314.8 |
| Maximum | 62505646 |
| Range | 62505562 |
| Interquartile range (IQR) | 422124 |
Descriptive statistics
| Standard deviation | 3453545.36 |
|---|---|
| Coefficient of variation (CV) | 4.059920832 |
| Kurtosis | 92.10445761 |
| Mean | 850643.523 |
| Median Absolute Deviation (MAD) | 102962 |
| Skewness | 9.007687467 |
| Sum | 1.552339365e+10 |
| Variance | 1.192697555e+13 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 5676 | 5 | < 0.1% | |
| 1200 | 5 | < 0.1% | |
| 2478 | 5 | < 0.1% | |
| 1643 | 4 | < 0.1% | |
| 7422 | 4 | < 0.1% | |
| 4984 | 4 | < 0.1% | |
| 3885 | 4 | < 0.1% | |
| 8676 | 4 | < 0.1% | |
| 3614 | 4 | < 0.1% | |
| 2807 | 4 | < 0.1% | |
| Other values (17127) | 18206 | 99.8% |
| Value | Count | Frequency (%) | |
| 84 | 1 | < 0.1% | |
| 379 | 1 | < 0.1% | |
| 385 | 1 | < 0.1% | |
| 419 | 1 | < 0.1% | |
| 472 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 62505646 | 1 | < 0.1% | |
| 61034457 | 1 | < 0.1% | |
| 52288697 | 1 | < 0.1% | |
| 47293921 | 1 | < 0.1% | |
| 46324529 | 1 | < 0.1% |
4046
Real number (ℝ≥0)
| Distinct | 12877 |
|---|---|
| Distinct (%) | 70.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 293007.9443 |
|---|---|
| Minimum | 0 |
| Maximum | 22743616 |
| Zeros | 242 |
| Zeros (%) | 1.3% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 854 |
| median | 8645 |
| Q3 | 111020 |
| 95-th percentile | 1263359.6 |
| Maximum | 22743616 |
| Range | 22743616 |
| Interquartile range (IQR) | 110166 |
Descriptive statistics
| Standard deviation | 1264989.081 |
|---|---|
| Coefficient of variation (CV) | 4.31725182 |
| Kurtosis | 86.80911253 |
| Mean | 293007.9443 |
| Median Absolute Deviation (MAD) | 8617 |
| Skewness | 8.648219758 |
| Sum | 5347101975 |
| Variance | 1.600197375e+12 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 242 | 1.3% | |
| 1 | 67 | 0.4% | |
| 2 | 53 | 0.3% | |
| 6 | 52 | 0.3% | |
| 7 | 47 | 0.3% | |
| 3 | 46 | 0.3% | |
| 4 | 43 | 0.2% | |
| 9 | 36 | 0.2% | |
| 5 | 36 | 0.2% | |
| 8 | 35 | 0.2% | |
| Other values (12867) | 17592 | 96.4% |
| Value | Count | Frequency (%) | |
| 0 | 242 | 1.3% | |
| 1 | 67 | 0.4% | |
| 2 | 53 | 0.3% | |
| 3 | 46 | 0.3% | |
| 4 | 43 | 0.2% |
| Value | Count | Frequency (%) | |
| 22743616 | 1 | < 0.1% | |
| 21620180 | 1 | < 0.1% | |
| 18933038 | 1 | < 0.1% | |
| 17787611 | 1 | < 0.1% | |
| 17076650 | 1 | < 0.1% |
4225
Real number (ℝ≥0)
| Distinct | 14985 |
|---|---|
| Distinct (%) | 82.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 295154.0807 |
|---|---|
| Minimum | 0 |
| Maximum | 20470572 |
| Zeros | 61 |
| Zeros (%) | 0.3% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 103 |
| Q1 | 3008 |
| median | 29061 |
| Q3 | 150206 |
| 95-th percentile | 1303657.2 |
| Maximum | 20470572 |
| Range | 20470572 |
| Interquartile range (IQR) | 147198 |
Descriptive statistics
| Standard deviation | 1204120.403 |
|---|---|
| Coefficient of variation (CV) | 4.079633254 |
| Kurtosis | 91.94902186 |
| Mean | 295154.0807 |
| Median Absolute Deviation (MAD) | 28522 |
| Skewness | 8.942465602 |
| Sum | 5386266818 |
| Variance | 1.449905944e+12 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 61 | 0.3% | |
| 5 | 29 | 0.2% | |
| 16 | 18 | 0.1% | |
| 11 | 17 | 0.1% | |
| 2 | 17 | 0.1% | |
| 8 | 15 | 0.1% | |
| 41 | 15 | 0.1% | |
| 3 | 14 | 0.1% | |
| 44 | 13 | 0.1% | |
| 65 | 13 | 0.1% | |
| Other values (14975) | 18037 | 98.8% |
| Value | Count | Frequency (%) | |
| 0 | 61 | 0.3% | |
| 1 | 12 | 0.1% | |
| 2 | 17 | 0.1% | |
| 3 | 14 | 0.1% | |
| 4 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 20470572 | 1 | < 0.1% | |
| 20445501 | 1 | < 0.1% | |
| 20328161 | 1 | < 0.1% | |
| 18956479 | 1 | < 0.1% | |
| 17896391 | 1 | < 0.1% |
4770
Real number (ℝ≥0)
| Distinct | 7125 |
|---|---|
| Distinct (%) | 39.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22839.39673 |
|---|---|
| Minimum | 0 |
| Maximum | 2546439 |
| Zeros | 5498 |
| Zeros (%) | 30.1% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 184 |
| Q3 | 6243 |
| 95-th percentile | 106156 |
| Maximum | 2546439 |
| Range | 2546439 |
| Interquartile range (IQR) | 6243 |
Descriptive statistics
| Standard deviation | 107464.0369 |
|---|---|
| Coefficient of variation (CV) | 4.705204701 |
| Kurtosis | 132.5635595 |
| Mean | 22839.39673 |
| Median Absolute Deviation (MAD) | 184 |
| Skewness | 10.15940119 |
| Sum | 416796151 |
| Variance | 1.154851922e+10 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 5498 | 30.1% | |
| 1 | 132 | 0.7% | |
| 3 | 122 | 0.7% | |
| 2 | 101 | 0.6% | |
| 4 | 90 | 0.5% | |
| 6 | 80 | 0.4% | |
| 8 | 72 | 0.4% | |
| 9 | 72 | 0.4% | |
| 7 | 67 | 0.4% | |
| 12 | 66 | 0.4% | |
| Other values (7115) | 11949 | 65.5% |
| Value | Count | Frequency (%) | |
| 0 | 5498 | 30.1% | |
| 1 | 132 | 0.7% | |
| 2 | 101 | 0.6% | |
| 3 | 122 | 0.7% | |
| 4 | 90 | 0.5% |
| Value | Count | Frequency (%) | |
| 2546439 | 1 | < 0.1% | |
| 1993645 | 1 | < 0.1% | |
| 1896149 | 1 | < 0.1% | |
| 1880231 | 1 | < 0.1% | |
| 1811090 | 1 | < 0.1% |
total bags
Real number (ℝ≥0)
| Distinct | 15963 |
|---|---|
| Distinct (%) | 87.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 239638.7163 |
|---|---|
| Minimum | 0 |
| Maximum | 19373134 |
| Zeros | 15 |
| Zeros (%) | 0.1% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 628.8 |
| Q1 | 5088 |
| median | 39743 |
| Q3 | 110783 |
| 95-th percentile | 1005478.2 |
| Maximum | 19373134 |
| Range | 19373134 |
| Interquartile range (IQR) | 105695 |
Descriptive statistics
| Standard deviation | 986242.3999 |
|---|---|
| Coefficient of variation (CV) | 4.115538655 |
| Kurtosis | 112.2721574 |
| Mean | 239638.7163 |
| Median Absolute Deviation (MAD) | 37300 |
| Skewness | 9.756071704 |
| Sum | 4373166933 |
| Variance | 9.726740714e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 15 | 0.1% | |
| 266 | 8 | < 0.1% | |
| 923 | 7 | < 0.1% | |
| 413 | 7 | < 0.1% | |
| 916 | 6 | < 0.1% | |
| 880 | 6 | < 0.1% | |
| 326 | 6 | < 0.1% | |
| 841 | 6 | < 0.1% | |
| 884 | 6 | < 0.1% | |
| 990 | 6 | < 0.1% | |
| Other values (15953) | 18176 | 99.6% |
| Value | Count | Frequency (%) | |
| 0 | 15 | 0.1% | |
| 3 | 4 | < 0.1% | |
| 6 | 4 | < 0.1% | |
| 7 | 2 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 19373134 | 1 | < 0.1% | |
| 16394524 | 1 | < 0.1% | |
| 16298296 | 1 | < 0.1% | |
| 15972492 | 1 | < 0.1% | |
| 15804696 | 1 | < 0.1% |
small bags
Real number (ℝ≥0)
| Distinct | 14913 |
|---|---|
| Distinct (%) | 81.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 182194.2177 |
|---|---|
| Minimum | 0 |
| Maximum | 13384586 |
| Zeros | 159 |
| Zeros (%) | 0.9% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 256 |
| Q1 | 2849 |
| median | 26362 |
| Q3 | 83337 |
| 95-th percentile | 768146.8 |
| Maximum | 13384586 |
| Range | 13384586 |
| Interquartile range (IQR) | 80488 |
Descriptive statistics
| Standard deviation | 746178.5104 |
|---|---|
| Coefficient of variation (CV) | 4.095511482 |
| Kurtosis | 107.0128857 |
| Mean | 182194.2177 |
| Median Absolute Deviation (MAD) | 25599 |
| Skewness | 9.540660024 |
| Sum | 3324862278 |
| Variance | 5.567823694e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 159 | 0.9% | |
| 223 | 14 | 0.1% | |
| 203 | 14 | 0.1% | |
| 40 | 13 | 0.1% | |
| 123 | 11 | 0.1% | |
| 6 | 11 | 0.1% | |
| 533 | 10 | 0.1% | |
| 20 | 10 | 0.1% | |
| 103 | 10 | 0.1% | |
| 3 | 10 | 0.1% | |
| Other values (14903) | 17987 | 98.6% |
| Value | Count | Frequency (%) | |
| 0 | 159 | 0.9% | |
| 2 | 8 | < 0.1% | |
| 3 | 10 | 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 13384586 | 1 | < 0.1% | |
| 12567155 | 1 | < 0.1% | |
| 12540327 | 1 | < 0.1% | |
| 11712807 | 1 | < 0.1% | |
| 11392828 | 1 | < 0.1% |
large bags
Real number (ℝ≥0)
| Distinct | 10486 |
|---|---|
| Distinct (%) | 57.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 54337.66541 |
|---|---|
| Minimum | 0 |
| Maximum | 5719096 |
| Zeros | 2371 |
| Zeros (%) | 13.0% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 127 |
| median | 2647 |
| Q3 | 22029 |
| 95-th percentile | 195699.2 |
| Maximum | 5719096 |
| Range | 5719096 |
| Interquartile range (IQR) | 21902 |
Descriptive statistics
| Standard deviation | 243965.9461 |
|---|---|
| Coefficient of variation (CV) | 4.489812808 |
| Kurtosis | 117.9994984 |
| Mean | 54337.66541 |
| Median Absolute Deviation (MAD) | 2647 |
| Skewness | 9.796455521 |
| Sum | 991608056 |
| Variance | 5.951938285e+10 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 2371 | 13.0% | |
| 3 | 238 | 1.3% | |
| 6 | 134 | 0.7% | |
| 4 | 77 | 0.4% | |
| 10 | 76 | 0.4% | |
| 5 | 59 | 0.3% | |
| 8 | 57 | 0.3% | |
| 13 | 54 | 0.3% | |
| 2 | 48 | 0.3% | |
| 26 | 41 | 0.2% | |
| Other values (10476) | 15094 | 82.7% |
| Value | Count | Frequency (%) | |
| 0 | 2371 | 13.0% | |
| 1 | 18 | 0.1% | |
| 2 | 48 | 0.3% | |
| 3 | 238 | 1.3% | |
| 4 | 77 | 0.4% |
| Value | Count | Frequency (%) | |
| 5719096 | 1 | < 0.1% | |
| 4324231 | 1 | < 0.1% | |
| 4081397 | 1 | < 0.1% | |
| 4023485 | 1 | < 0.1% | |
| 3988101 | 1 | < 0.1% |
xlarge bags
Real number (ℝ≥0)
| Distinct | 3577 |
|---|---|
| Distinct (%) | 19.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3106.279029 |
|---|---|
| Minimum | 0 |
| Maximum | 551693 |
| Zeros | 12048 |
| Zeros (%) | 66.0% |
| Memory size | 142.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 132 |
| 95-th percentile | 12058.2 |
| Maximum | 551693 |
| Range | 551693 |
| Interquartile range (IQR) | 132 |
Descriptive statistics
| Standard deviation | 17692.83749 |
|---|---|
| Coefficient of variation (CV) | 5.695830066 |
| Kurtosis | 233.6046317 |
| Mean | 3106.279029 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.13982169 |
| Sum | 56686486 |
| Variance | 313036498.3 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 12048 | 66.0% | |
| 3 | 74 | 0.4% | |
| 2 | 69 | 0.4% | |
| 6 | 55 | 0.3% | |
| 1 | 55 | 0.3% | |
| 7 | 52 | 0.3% | |
| 5 | 49 | 0.3% | |
| 8 | 44 | 0.2% | |
| 4 | 40 | 0.2% | |
| 15 | 37 | 0.2% | |
| Other values (3567) | 5726 | 31.4% |
| Value | Count | Frequency (%) | |
| 0 | 12048 | 66.0% | |
| 1 | 55 | 0.3% | |
| 2 | 69 | 0.4% | |
| 3 | 74 | 0.4% | |
| 4 | 40 | 0.2% |
| Value | Count | Frequency (%) | |
| 551693 | 1 | < 0.1% | |
| 454343 | 1 | < 0.1% | |
| 390478 | 1 | < 0.1% | |
| 387400 | 1 | < 0.1% | |
| 377661 | 1 | < 0.1% |
type
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 17.9 KiB |
| Value | Count | Frequency (%) | |
| conventional | 9126 | 50.0% | |
| organic | 9123 | 50.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 9.500410981 |
| Min length | 7 |
year
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 18.0 KiB |
| Value | Count | Frequency (%) | |
| 2017 | 5722 | 31.4% | |
| 2016 | 5616 | 30.8% | |
| 2015 | 5615 | 30.8% | |
| 2018 | 1296 | 7.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
region
Categorical
| Distinct | 54 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.7 KiB |
| Value | Count | Frequency (%) | |
| Nashville | 338 | 1.9% | |
| Detroit | 338 | 1.9% | |
| MiamiFtLauderdale | 338 | 1.9% | |
| Louisville | 338 | 1.9% | |
| LosAngeles | 338 | 1.9% | |
| LasVegas | 338 | 1.9% | |
| Jacksonville | 338 | 1.9% | |
| Indianapolis | 338 | 1.9% | |
| Houston | 338 | 1.9% | |
| HartfordSpringfield | 338 | 1.9% | |
| Other values (44) | 14869 | 81.5% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 19 |
|---|---|
| Median length | 9 |
| Mean length | 10.29535865 |
| Min length | 4 |
month
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 18.2 KiB |
| Value | Count | Frequency (%) | |
| 1 | 1944 | 10.7% | |
| 3 | 1836 | 10.1% | |
| 2 | 1728 | 9.5% | |
| 10 | 1512 | 8.3% | |
| 7 | 1512 | 8.3% | |
| 5 | 1512 | 8.3% | |
| 11 | 1404 | 7.7% | |
| 8 | 1404 | 7.7% | |
| 4 | 1404 | 7.7% | |
| 12 | 1403 | 7.7% | |
| Other values (2) | 2590 | 14.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.236670502 |
| Min length | 1 |
day
Categorical
| Distinct | 31 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| 4 | 756 | 4.1% | |
| 11 | 756 | 4.1% | |
| 18 | 755 | 4.1% | |
| 25 | 755 | 4.1% | |
| 19 | 648 | 3.6% | |
| 8 | 648 | 3.6% | |
| 10 | 648 | 3.6% | |
| 12 | 648 | 3.6% | |
| 15 | 648 | 3.6% | |
| 3 | 648 | 3.6% | |
| Other values (21) | 11339 | 62.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 1.710066305 |
| Min length | 1 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
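As an illustration of the definition above, the following sketch computes r directly as covariance divided by the product of standard deviations. It is not the code the report runs; `pearson_r` is a hypothetical helper, and the sample values are the averageprice and total volume of the first five rows of this dataset.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their std devs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# averageprice and total volume for the first five rows of the dataset.
price = [1.33, 1.35, 0.93, 1.08, 1.28]
volume = [64236, 54876, 118220, 78992, 51039]
r_sample = pearson_r(price, volume)
print(r_sample)

# Invariance under a change of location and scale of one variable.
r_rescaled = pearson_r([3 * p + 5 for p in price], volume)
```

Rescaling price with a positive factor and shifting it leaves r unchanged up to floating-point error, which is the invariance property the text describes.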
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
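The rank-based definition above can be sketched as Pearson's r applied to ranks. This is a minimal illustration rather than the report's implementation; `average_ranks` and `spearman_rho` are hypothetical helpers, and assigning tied values the average of their ranks is an assumption about tie handling.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# A cubic relationship is nonlinear but strictly increasing.
xs = [1, 2, 3, 4, 5]
ys = [v ** 3 for v in xs]
print(spearman_rho(xs, ys))
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves ρ unchanged.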
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
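The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that formula; `kendall_tau_a` is a hypothetical helper, and statistical libraries often default to a tie-adjusted variant instead, so results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The loop visits every pair, so the cost grows quadratically with the number of observations; that is acceptable for a sketch but slow for large columns.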
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | date | averageprice | total volume | 4046 | 4225 | 4770 | total bags | small bags | large bags | xlarge bags | type | year | region | month | day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2015-12-27 | 1.33 | 64236 | 1036 | 54454 | 48 | 8696 | 8603 | 93 | 0 | conventional | 2015 | Albany | 12 | 27 |
| 1 | 1 | 2015-12-20 | 1.35 | 54876 | 674 | 44638 | 58 | 9505 | 9408 | 97 | 0 | conventional | 2015 | Albany | 12 | 20 |
| 2 | 2 | 2015-12-13 | 0.93 | 118220 | 794 | 109149 | 130 | 8145 | 8042 | 103 | 0 | conventional | 2015 | Albany | 12 | 13 |
| 3 | 3 | 2015-12-06 | 1.08 | 78992 | 1132 | 71976 | 72 | 5811 | 5677 | 133 | 0 | conventional | 2015 | Albany | 12 | 6 |
| 4 | 4 | 2015-11-29 | 1.28 | 51039 | 941 | 43838 | 75 | 6183 | 5986 | 197 | 0 | conventional | 2015 | Albany | 11 | 29 |
| 5 | 5 | 2015-11-22 | 1.26 | 55979 | 1184 | 48067 | 43 | 6683 | 6556 | 127 | 0 | conventional | 2015 | Albany | 11 | 22 |
| 6 | 6 | 2015-11-15 | 0.99 | 83453 | 1368 | 73672 | 93 | 8318 | 8196 | 122 | 0 | conventional | 2015 | Albany | 11 | 15 |
| 7 | 7 | 2015-11-08 | 0.98 | 109428 | 703 | 101815 | 80 | 6829 | 6266 | 562 | 0 | conventional | 2015 | Albany | 11 | 8 |
| 8 | 8 | 2015-11-01 | 1.02 | 99811 | 1022 | 87315 | 85 | 11388 | 11104 | 283 | 0 | conventional | 2015 | Albany | 11 | 1 |
| 9 | 9 | 2015-10-25 | 1.07 | 74338 | 842 | 64757 | 113 | 8625 | 8061 | 564 | 0 | conventional | 2015 | Albany | 10 | 25 |
Last rows
| | df_index | date | averageprice | total volume | 4046 | 4225 | 4770 | total bags | small bags | large bags | xlarge bags | type | year | region | month | day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18239 | 2 | 2018-03-11 | 1.56 | 22128 | 2162 | 3194 | 8 | 16762 | 16510 | 252 | 0 | organic | 2018 | WestTexNewMexico | 3 | 11 |
| 18240 | 3 | 2018-03-04 | 1.54 | 17393 | 1832 | 1905 | 0 | 13655 | 13401 | 253 | 0 | organic | 2018 | WestTexNewMexico | 3 | 4 |
| 18241 | 4 | 2018-02-25 | 1.57 | 18421 | 1974 | 2482 | 0 | 13964 | 13698 | 266 | 0 | organic | 2018 | WestTexNewMexico | 2 | 25 |
| 18242 | 5 | 2018-02-18 | 1.56 | 17597 | 1892 | 1928 | 0 | 13776 | 13553 | 223 | 0 | organic | 2018 | WestTexNewMexico | 2 | 18 |
| 18243 | 6 | 2018-02-11 | 1.57 | 15986 | 1924 | 1368 | 0 | 12693 | 12437 | 256 | 0 | organic | 2018 | WestTexNewMexico | 2 | 11 |
| 18244 | 7 | 2018-02-04 | 1.63 | 17074 | 2046 | 1529 | 0 | 13498 | 13066 | 431 | 0 | organic | 2018 | WestTexNewMexico | 2 | 4 |
| 18245 | 8 | 2018-01-28 | 1.71 | 13888 | 1191 | 3431 | 0 | 9264 | 8940 | 324 | 0 | organic | 2018 | WestTexNewMexico | 1 | 28 |
| 18246 | 9 | 2018-01-21 | 1.87 | 13766 | 1191 | 2452 | 727 | 9394 | 9351 | 42 | 0 | organic | 2018 | WestTexNewMexico | 1 | 21 |
| 18247 | 10 | 2018-01-14 | 1.93 | 16205 | 1527 | 2981 | 727 | 10969 | 10919 | 50 | 0 | organic | 2018 | WestTexNewMexico | 1 | 14 |
| 18248 | 11 | 2018-01-07 | 1.62 | 17489 | 2894 | 2356 | 224 | 12014 | 11988 | 26 | 0 | organic | 2018 | WestTexNewMexico | 1 | 7 |
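In the rows shown above, the year, month and day columns agree with the components of the date column. The sketch below shows one way such columns could be derived; `date_parts` is a hypothetical helper, and whether the dataset was actually built this way is an assumption.

```python
from datetime import date


def date_parts(iso_date: str) -> dict:
    """Split an ISO 8601 date string into year, month and day integers."""
    parsed = date.fromisoformat(iso_date)
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


# Dates taken from the first and last rows shown above.
for sample in ("2015-12-27", "2018-01-07"):
    print(sample, date_parts(sample))
```

Because these three columns carry no information beyond date itself, they are candidates for dropping before modelling unless seasonality is modelled through them.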